Word Sense Disambiguation Using Vectors of Co-occurrence Information

Authors

  • Saim Shin
  • Yong-Seok Choi
  • Key-Sun Choi
Abstract

This paper reports on word sense disambiguation of Korean nouns using co-occurrence information from context. For a given noun, the distribution of words in its local context is not enough to express its semantic characteristics for sense disambiguation. The paper therefore proposes cluster-based senses as base vectors. Contextual noise is removed by a term weighting method, and hypernyms of the remaining contextual words are used to modify the base vector to improve discrimination. These hypernyms are extracted from dictionary definition patterns, with some loss of precision. When disambiguation fails, the most dominant sense in the training data is used. Experiments on the Korean SENSEVAL test suite show that the method improves precision by up to 42%.
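The pipeline the abstract outlines (per-sense co-occurrence vectors, noise removal, hypernym expansion, and a fallback to the dominant sense) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the clustering or the term weighting method, so the sketch assumes plain count vectors, a simple stop list standing in for term weighting, and cosine similarity for matching. All function names, the English toy data for the noun "bank", and the `hypernyms` mapping are hypothetical.

```python
import math
from collections import Counter, defaultdict

def expand_with_hypernyms(words, hypernyms):
    """Append the hypernym of each word that has one (hypothetical mapping)."""
    return list(words) + [hypernyms[w] for w in words if w in hypernyms]

def build_sense_vectors(training, hypernyms):
    """training: list of (sense, context_words) pairs.
    Returns one count vector per sense and the most frequent sense."""
    vectors = defaultdict(Counter)
    sense_counts = Counter()
    for sense, words in training:
        vectors[sense].update(expand_with_hypernyms(words, hypernyms))
        sense_counts[sense] += 1
    dominant = sense_counts.most_common(1)[0][0]
    return dict(vectors), dominant

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def disambiguate(context, vectors, dominant, hypernyms, stopwords=frozenset()):
    """Pick the sense whose vector is closest to the context.
    Falls back to the dominant training sense when nothing matches."""
    # Assumption: a stop list stands in for the unspecified term weighting.
    words = [w for w in context if w not in stopwords]
    ctx = Counter(expand_with_hypernyms(words, hypernyms))
    scores = {sense: cosine(ctx, vec) for sense, vec in vectors.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else dominant

# Toy data, invented for illustration only.
hypernyms = {"loan": "debt", "mortgage": "debt", "stream": "water"}
training = [
    ("finance", ["money", "loan", "deposit"]),
    ("finance", ["loan", "interest"]),
    ("river", ["water", "shore"]),
]
vectors, dominant = build_sense_vectors(training, hypernyms)
```

In this toy setup, `disambiguate(["the", "mortgage", "payment"], vectors, dominant, hypernyms, stopwords={"the"})` matches the finance sense only through the shared hypernym "debt", which is the kind of gain in discrimination the abstract attributes to hypernyms. A context with no overlap at all, such as `["tree"]`, falls back to the dominant sense.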


Similar resources

Using WordNet-Based Context Vectors To Estimate The Semantic Relatedness Of Concepts

In this paper, we introduce a WordNet-based measure of semantic relatedness by combining the structure and content of WordNet with co-occurrence information derived from raw text. We use the co-occurrence information along with the WordNet definitions to build gloss vectors corresponding to each concept in WordNet. Numeric scores of relatedness are assigned to a pair of concepts by measuring the...

Latent Semantic Word Sense Disambiguation Using Global Co-occurrence Information

In this paper, I propose a novel word sense disambiguation method based on global co-occurrence information using NMF. When calculating the dependency relation matrix, the existing method tends to produce a very sparse co-occurrence matrix from a small training set. Therefore, the NMF algorithm sometimes does not converge to the desired solutions. To obtain a large number of co-occurrence relatio...

Word Sense Disambiguation Using Neural Networks with Concept Co-occurrence Information

Most previous word sense disambiguation approaches based on neural networks were impractical due to their huge feature set size. We propose a method for resolving word sense ambiguity using neural networks with refined concept co-occurrence information (CCI) as features. Using CCI refinement processing, we reduce the number of features of the network to a practical size. We also show that word ...

Trans-EZ at NTCIR-2 : Synset Co-occurrence Method for English-Chinese Cross-Lingual Information Retrieval

In this paper, a new method for English-Chinese cross-lingual information retrieval is proposed and evaluated in the NTCIR-II project. We use bilingual resources and contextual information to deal with word sense disambiguation (WSD) and translation disambiguation for query translation. An English-Chinese WordNet and a synset co-occurrence model are adopted to solve the problem of word sense...

Improving Word Sense Discrimination with Gloss Augmented Feature Vectors

This paper presents a method of unsupervised word sense discrimination that augments co-occurrence feature vectors derived from raw untagged corpora with information from the glosses found in a machine readable dictionary. Each content word that occurs in the context of a target word to be discriminated is represented by a co-occurrence feature vector. Each of these vectors is augmented with th...

Publication date: 2001